Freebase (database)

Freebase was a large collaborative knowledge base consisting of data composed mainly by its community members. It was an online collection of structured data harvested from many sources, including individual, user-submitted wiki contributions. Freebase aimed to create a global resource that allowed people (and machines) to access common information more effectively. It was developed by the American software company Metaweb and ran publicly beginning in March 2007. Metaweb was acquired by Google in a private sale announced on 16 July 2010. Google's Knowledge Graph is powered in part by Freebase. During its existence, Freebase data was available for commercial and non-commercial use under a Creative Commons Attribution License, and an open API, an RDF endpoint, and a database dump were provided for programmers.

On 16 December 2014, Google announced that it would shut down Freebase over the succeeding six months and help move the data from Freebase to Wikidata. On 16 December 2015, Google officially announced the Knowledge Graph API, intended as a replacement for the Freebase API. Freebase.com was officially shut down on 2 May 2016. Both Graphd and MQL, the graph database and JSON-based query language developed by Metaweb for Freebase, were open-sourced by Google under the Apache 2.0 license and are available on GitHub. Graphd was open-sourced on September 8, 2018, and MQL on August 4, 2020.


Overview

On 3 March 2007 Metaweb announced Freebase, describing it as "an open shared database of the world's knowledge", and "a massive, collaboratively edited database of cross-linked data". Often understood as a Wikipedia turned into a database, or as an entity-relationship model, Freebase provided an interface that allowed non-programmers to fill in structured data, or metadata, of general information and to categorize or connect data items in meaningful, semantic ways. As Tim O'Reilly described it upon the launch, "Freebase is the bridge between the bottom up vision of Web 2.0 collective intelligence and the more structured world of the semantic web".

Freebase contained data harvested from sources such as Wikipedia, NNDB, Fashion Model Directory and MusicBrainz, as well as data contributed by its users. The structured data was licensed under the Creative Commons Attribution License, and a JSON-based HTTP API was provided to programmers for developing applications on any platform that used the Freebase data. The source code for the Metaweb application itself was proprietary.

Freebase ran on a database infrastructure created in-house by Metaweb that used a graph model: instead of using tables and keys to define data structures, Freebase defined its data structure as a set of nodes and a set of links that established relationships between the nodes. Because its data structure was non-hierarchical, Freebase could model much more complex relationships between individual elements than a conventional database, and was open for users to enter new objects and relationships into the underlying graph. Queries to the database were made in Metaweb Query Language (MQL) and served by a triplestore called graphd.
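
The nodes-and-links model described above can be illustrated with a minimal Python sketch. This is not graphd or MQL: the `Link` type, the `match` helper, and all node names and link labels are hypothetical, chosen only to show how facts stored as links (subject, label, target) can be queried by pattern without tables or keys.

```python
from typing import NamedTuple, Optional


class Link(NamedTuple):
    """A directed, labeled link between two nodes (hypothetical structure)."""
    source: str
    label: str
    target: str


# Illustrative links only; the labels are invented for this sketch.
LINKS = [
    Link("freebase", "developed_by", "metaweb"),
    Link("metaweb", "acquired_by", "google"),
    Link("kurt_bollacker", "chief_scientist_of", "metaweb"),
    Link("john_giannandrea", "cto_of", "metaweb"),
    Link("freebase", "queried_with", "mql"),
    Link("freebase", "served_by", "graphd"),
]


def match(links, source: Optional[str] = None,
          label: Optional[str] = None, target: Optional[str] = None):
    """Return every link that fits the pattern; None acts as a wildcard."""
    return [
        link for link in links
        if (source is None or link.source == source)
        and (label is None or link.label == label)
        and (target is None or link.target == target)
    ]


# Which nodes link to "metaweb", regardless of the relationship?
pointing_at_metaweb = [link.source for link in match(LINKS, target="metaweb")]

# Everything recorded about "freebase" as (label, target) pairs.
about_freebase = [(link.label, link.target)
                  for link in match(LINKS, source="freebase")]
```

Because any node can link to any other, adding a new kind of relationship means appending another link rather than altering a table definition, which is the flexibility the paragraph above attributes to the graph model.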


Development

Danny Hillis first described his idea for creating a knowledge web he called Aristotle in a paper in 2000, but he said he did not try to build the system until he had recruited technical experts. Veda Hlubinka-Cook, an expert in parallel computing, became Metaweb's Executive Vice President for Product. Kurt Bollacker brought deep expertise in distributed systems, database design, and information retrieval to his role as Chief Scientist at Metaweb. John Giannandrea, formerly Chief Technologist at Tellme Networks and Chief Technologist of the Web browser group at Netscape/AOL, was Chief Technology Officer.

Originally accessible by invitation only, Freebase opened full anonymous read access to the public in its alpha stage of development and later required registration only for data contributions. On 29 October 2008, at the International Semantic Web Conference 2008, Freebase released its RDF service for generating RDF representations of Freebase topics, allowing Freebase to be used as linked data.
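
To give a sense of what an RDF representation of a topic looks like, the following Python sketch serializes a few facts as N-Triples lines. It is an illustration only, not the Freebase RDF service: the `to_ntriples` function is hypothetical, and the `http://example.org/ns/` namespace is a placeholder rather than any namespace Freebase actually used.

```python
# Placeholder namespace for this sketch; not Freebase's real namespace.
BASE = "http://example.org/ns/"


def to_ntriples(subject, facts):
    """Serialize (predicate, value, is_literal) facts about one subject.

    Each fact becomes one N-Triples line: <subject> <predicate> object .
    """
    lines = []
    for predicate, value, is_literal in facts:
        if is_literal:
            # Escape backslashes and double quotes inside string literals.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            obj = f"<{BASE}{value}>"
        lines.append(f"<{BASE}{subject}> <{BASE}{predicate}> {obj} .")
    return "\n".join(lines)


rdf_text = to_ntriples("freebase", [
    ("developed_by", "metaweb", False),  # link to another resource
    ("name", "Freebase", True),          # plain string literal
])
```

Giving every topic and property a URI in this way is what lets other datasets point at the same resources, which is the sense in which such output can be consumed as linked data.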


Organization and policy

Freebase's subjects were called "topics", and the data stored about them depended on their "type", that is, on how they were classified. For example, an entry for Arnold Schwarzenegger, the former governor of California, would be entered as a topic that would include a variety of types describing him as an actor, bodybuilder, and politician. Freebase had approximately 44 million topics and 2.4 billion facts. Freebase's types were themselves user-editable. Each type had a number of defined predicates, called "properties".

Unlike the W3C approach to the semantic web, which starts with controlled ontologies, Metaweb adopted a folksonomy approach, in which people could add new categories (much like tags), in a messy sprawl of potentially overlapping assertions.

However, Freebase differed from the wiki model in many ways. User-created types were not adopted in the "public commons" until promoted by a Metaweb employee. Also, users could not modify each other's types. Freebase could not open up schema permissions because external applications relied on the schemas: changing a type's schema, for instance by deleting a property or changing a simple property, might have broken queries for API users and even within Freebase itself, for example in saved views.
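
The relationship between topics, types, and properties, and the reason schema edits were restricted, can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `schema` and `topic` layouts, the property names, and the `read_property` helper are hypothetical and do not reproduce Freebase's actual schema or API.

```python
# Hypothetical schema: each type maps to the set of properties it defines.
schema = {
    "actor": {"films"},
    "bodybuilder": {"competitions"},
    "politician": {"offices_held", "party"},
}

# One topic carrying several types, as in the Schwarzenegger example above.
topic = {
    "name": "Arnold Schwarzenegger",
    "types": {"actor", "bodybuilder", "politician"},
    "values": {
        ("politician", "offices_held"): ["Governor of California"],
    },
}


def read_property(schema, topic, type_name, prop):
    """Read a property of a topic, validating it against the schema first."""
    if type_name not in topic["types"]:
        raise KeyError(f"topic is not typed as {type_name!r}")
    if prop not in schema.get(type_name, set()):
        raise KeyError(f"type {type_name!r} has no property {prop!r}")
    return topic["values"].get((type_name, prop), [])


# A client query that works against the current schema.
before = read_property(schema, topic, "politician", "offices_held")

# Simulate another user deleting the property from the type's schema.
edited = {**schema, "politician": schema["politician"] - {"offices_held"}}

try:
    read_property(edited, topic, "politician", "offices_held")
    query_broke = False
except KeyError:
    query_broke = True  # the same client query now fails
```

In this sketch the stored data is untouched, yet the query fails once the property disappears from the schema, which mirrors the concern that schema changes could break queries for API users and saved views.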


Discontinuation

On 16 December 2014, the Freebase team officially announced that the website and the application programming interface would be shut down by 30 June 2015. Google provided an update on 16 December 2015 that they would discontinue the Freebase API and widget three months after a Suggest widget replacement was launched in early 2016.


See also

* BabelNet
* Cyc
* DBpedia
* Entity–relationship model
* True Knowledge
* YAGO
* Knowledge Vault
* Wikidata

